Abstract

We compute the loss of power in likelihood ratio tests when we test 
the original parameter of a probability density extended by the first 
Lehmann alternative. 



1 Distributions Generated by Lehmann Alternatives



In the context of parametric models for lifetime data, [Gupta et alii 1998]
disseminated the study of distributions generated by Lehmann alternatives,
cumulative distributions that take one of the following forms:

$$G_1(x, \lambda) = [F(x)]^{\lambda} \quad \text{or} \quad G_2(x, \lambda) = 1 - [1 - F(x)]^{\lambda} \tag{1}$$

where $F(x)$ is any cumulative distribution and $\lambda > 0$. In the present
note, we call both $G$ distributions generated distributions or extended
distributions. It is easy to see that, for integer values of $\lambda$, $G_1$
and $G_2$ are, respectively, the distributions of the maximum and the minimum
of a sample of size $\lambda$ drawn from $F$, that the support of the two
distributions is the same as that of $F$, and that the associated density
functions are

$$g_1(x, \lambda) = \lambda f(x) [F(x)]^{\lambda - 1} \quad \text{and} \quad g_2(x, \lambda) = \lambda f(x) [1 - F(x)]^{\lambda - 1} \tag{2}$$

where $f(x)$ is the density function associated with $F$. Suppose that we
generate a distribution $G(x|\lambda)$ based on the distribution $F(x)$, and
want to generate another distribution $G'(x|\lambda, \lambda')$ by repeating
the process. It is easy to see that $G'$ belongs to the same family as $G$,
since the new parameter $\lambda\lambda'$ can be summarized as a single one.
This has the interesting side effect that the standard uniparametric
exponential distribution, with cumulative distribution $1 - e^{-\lambda x}$,
may be seen as a distribution generated by the second Lehmann alternative
from the distribution $F(x) = 1 - e^{-x}$.
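The properties listed above can be checked numerically. The following is a minimal sketch, not part of the original note: the helper names (`lehmann_first`, `lehmann_second`, `std_exponential_cdf`, `empirical_cdf_of_max`) are hypothetical, and the standard exponential is assumed as the baseline $F$.

```python
import math
import random

def lehmann_first(F, lam):
    """CDF of the first Lehmann alternative: G1(x) = F(x)**lam."""
    return lambda x: F(x) ** lam

def lehmann_second(F, lam):
    """CDF of the second Lehmann alternative: G2(x) = 1 - (1 - F(x))**lam."""
    return lambda x: 1.0 - (1.0 - F(x)) ** lam

def std_exponential_cdf(x):
    """Assumed baseline F(x) = 1 - exp(-x) for x > 0, and 0 otherwise."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def empirical_cdf_of_max(m, x, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(max of m standard exponentials <= x)."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.expovariate(1.0) for _ in range(m)) <= x
        for _ in range(n_draws)
    )
    return hits / n_draws

if __name__ == "__main__":
    x = 1.3
    # Second alternative of the standard exponential: exponential with rate lam.
    print(lehmann_second(std_exponential_cdf, 2.5)(x), 1.0 - math.exp(-2.5 * x))
    # Repeating the first alternative multiplies the parameters (2.0 * 1.5 = 3.0).
    twice = lehmann_first(lehmann_first(std_exponential_cdf, 2.0), 1.5)
    once = lehmann_first(std_exponential_cdf, 3.0)
    print(twice(x), once(x))
    # For integer lam, G1 is the CDF of the sample maximum.
    print(lehmann_first(std_exponential_cdf, 3)(x), empirical_cdf_of_max(3, x))
```

Each printed pair should agree, the last one only up to Monte Carlo error.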

To compute the moments of distributions generated by Lehmann alternatives,
we use the change of variables $u = F(x)$ in the expression

$$E[X^k | \lambda] = \int_{-\infty}^{\infty} x^k \lambda f(x) [F(x)]^{\lambda - 1} \, dx \tag{3}$$

yielding

$$E[X^k | \lambda] = \int_0^1 \lambda [Q(u)]^k u^{\lambda - 1} \, du = E_{\mathrm{Beta}(\lambda, 1)}\left[ Q^k(U) \right] \tag{4}$$

where $Q(u) = F^{-1}(u)$ is the quantile function. This integral is
equivalent to the expectation of $Q^k(U)$ with respect to a Beta distribution
with parameters $\alpha = \lambda$, $\beta = 1$. The same reasoning can be
used to show that, for the second Lehmann alternative,
$E[X^k | \lambda] = E_{\mathrm{Beta}(1, \lambda)}\left[ Q^k(U) \right]$.
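As an illustration of the Beta-expectation form of the moments, the sketch below evaluates the integral over $u$ with a midpoint rule. It assumes the standard exponential baseline, whose quantile function is $Q(u) = -\ln(1 - u)$; the function names are hypothetical. For $\lambda = 3$ and $k = 1$ the result should be close to $1 + 1/2 + 1/3$, the known mean of the maximum of three independent standard exponentials.

```python
import math

def beta_expectation_moment(quantile, lam, k=1, n_cells=200_000):
    """Midpoint-rule value of integral_0^1 lam * Q(u)**k * u**(lam - 1) du."""
    h = 1.0 / n_cells
    total = 0.0
    for i in range(n_cells):
        u = (i + 0.5) * h  # cell midpoint, never exactly 0 or 1
        total += lam * quantile(u) ** k * u ** (lam - 1.0)
    return total * h

def exp_quantile(u):
    """Quantile function of the standard exponential: Q(u) = -ln(1 - u)."""
    return -math.log1p(-u)

if __name__ == "__main__":
    print(beta_expectation_moment(exp_quantile, lam=3.0, k=1))
    print(1.0 + 1.0 / 2.0 + 1.0 / 3.0)
```

The midpoint rule is chosen only because it avoids evaluating the logarithm at the endpoint $u = 1$; any quadrature that handles the endpoint singularity would serve.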
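Using the log-likelihood functions

$$\ell_1(\mathbf{x}, \lambda) = n \ln(\lambda) + \sum_{j=1}^{n} \ln f(x_j) + (\lambda - 1) \sum_{j=1}^{n} \ln F(x_j) \tag{5}$$

and

$$\ell_2(\mathbf{x}, \lambda) = n \ln(\lambda) + \sum_{j=1}^{n} \ln f(x_j) + (\lambda - 1) \sum_{j=1}^{n} \ln[1 - F(x_j)] \tag{6}$$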



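we see that the maximum likelihood estimators of the parameter $\lambda$ have
the forms

$$\hat{\lambda}_1 = -\frac{n}{\sum_{j=1}^{n} \ln F(x_j)} \quad \text{and} \quad \hat{\lambda}_2 = -\frac{n}{\sum_{j=1}^{n} \ln[1 - F(x_j)]} \tag{7}$$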
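A small simulation can illustrate the estimators in (7). The sketch below is only an example under stated assumptions: the baseline $F$ is taken to be the standard exponential and is treated as fully known, data from the first alternative are drawn by inversion, and the function names are hypothetical.

```python
import math
import random

def mle_first(sample, F):
    """Estimator for the first alternative: -n / sum(ln F(x_j))."""
    return -len(sample) / sum(math.log(F(x)) for x in sample)

def mle_second(sample, F):
    """Estimator for the second alternative: -n / sum(ln(1 - F(x_j)))."""
    return -len(sample) / sum(math.log1p(-F(x)) for x in sample)

def exp_cdf(x):
    """Assumed baseline F(x) = 1 - exp(-x)."""
    return -math.expm1(-x)

def draw_first(lam, n, rng):
    """Draw from G1 = F**lam by inversion: F(X) = U**(1/lam), X = -ln(1 - F(X))."""
    return [-math.log1p(-(rng.random() ** (1.0 / lam))) for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(1)
    sample1 = draw_first(2.5, 5000, rng)
    print("first alternative, true lam = 2.5:", mle_first(sample1, exp_cdf))
    # With this baseline the second alternative is exponential with rate lam,
    # so the estimator reduces to n / sum(x_j).
    sample2 = [rng.expovariate(2.5) for _ in range(5000)]
    print("second alternative, true lam = 2.5:", mle_second(sample2, exp_cdf))
```

Both estimates should fall near 2.5, with a spread that shrinks as the sample size grows.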

The existing literature about distributions generated by Lehmann
alternatives concerns mostly distributions defined on the interval
$(0, \infty)$ or on the real line, with the paper by [Nadarajah and Kotz 2006]
being the most complete review of progress and the paper [Nadarajah 2006]
being an interesting application of the concepts outside the original
proposal by [Gupta et alii 1998], which was to analyze lifetime data. These
are not the only papers dealing with the subject, but a complete list with
comments would be a paper on its own. In the present paper, we are concerned
with some information-theoretic quantities of the first extension.



2 Kullback-Leibler Divergence 

Given two probability density functions $f$ and $g$, the quantity defined as

$$D_{KL}(f|g) = \int f(x) \ln\left( \frac{f(x)}{g(x)} \right) dx \tag{8}$$

is called the Kullback-Leibler divergence (abbreviated DKL) after the authors
of the classical paper [Kullback and Leibler 1951]. Very often, this quantity
is used as a measure of distance between two probability density functions,
even though it is not a metric. This divergence measure is always greater
than or equal to zero, with zero occurring if and only if $f = g$, but it is
not symmetric, so in general $D_{KL}(f|g) \neq D_{KL}(g|f)$, and it does not
obey the triangle inequality either.

Rewriting equation (8), we get

$$\int_{\mathbb{R}} f(x) \ln\left( \frac{f(x)}{g(x)} \right) dx = \int_{\mathbb{R}} \left[ f(x) \ln(f(x)) - f(x) \ln(g(x)) \right] dx \tag{9}$$

$$= E_f[\ln(f(X))] - E_f[\ln(g(X))] \tag{10}$$

where $E_f[h(X)]$ is the expectation of the random variable $h(X)$ with
respect to the probability density $f$. Since $D_{KL}(f|g)$ is greater than
or equal to zero, we have that

$$E_f[\ln(f(X))] \geq E_f[\ln(g(X))] \tag{11}$$
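The non-negativity and the lack of symmetry can be seen in a small numerical example. The sketch below approximates the integral in (8) with a midpoint rule on a truncated interval; the two exponential densities and the helper names are assumptions made for illustration. For exponential densities with rates $a$ and $b$ the divergence has the known closed form $\ln(a/b) + b/a - 1$, which gives $1 - \ln 2$ in one direction and $\ln 2 - 1/2$ in the other for rates 1 and 2.

```python
import math

def kl_divergence(f, g, lo, hi, n_cells=100_000):
    """Midpoint-rule value of integral f(x) * ln(f(x) / g(x)) dx over [lo, hi]."""
    h = (hi - lo) / n_cells
    total = 0.0
    for i in range(n_cells):
        x = lo + (i + 0.5) * h
        fx = f(x)
        if fx > 0.0:  # the term 0 * ln(0) is taken as 0
            total += fx * math.log(fx / g(x))
    return total * h

def exp_pdf(rate):
    """Exponential density rate * exp(-rate * x) on x >= 0."""
    return lambda x: rate * math.exp(-rate * x)

if __name__ == "__main__":
    f, g = exp_pdf(1.0), exp_pdf(2.0)
    # Truncating at 50 is an assumption: the neglected tail mass is negligible here.
    print("D(f|g):", kl_divergence(f, g, 0.0, 50.0), "closed form:", 1.0 - math.log(2.0))
    print("D(g|f):", kl_divergence(g, f, 0.0, 50.0), "closed form:", math.log(2.0) - 0.5)
    print("D(f|f):", kl_divergence(f, f, 0.0, 50.0))
```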

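We will now show that maximizing the likelihood is equivalent to minimizing
$D_{KL}(e|f)$, where $e$ is the empirical distribution of the sample
$x_1, \ldots, x_n$ and $f = f(x, \theta)$ is the parametric model.
Calculating $D_{KL}(e|f)$ we arrive at

$$D_{KL}(e|f) = E_e[\ln(e(X))] - \frac{1}{n} \sum_{j=1}^{n} \ln(f(x_j, \theta)) \tag{12}$$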

where the rightmost term is the empirical log-likelihood multiplied by a
constant. So, by maximizing the log-likelihood we minimize the whole
divergence; the process of maximizing the likelihood is therefore equivalent
to minimizing the divergence between the empirical density and the parametric
model. This result is very common in the related literature, and is shown in
full detail in sources like [Eguchi and Copas 1998], which gives an
accessible but rather compact deduction of properties of methods based on
likelihood functions using DKL. In the next (and last) section we draw freely
from a result shown in the [Eguchi and Copas 1998] paper, which states that
DKL may be used to measure the loss of power in likelihood ratio tests
when the distribution under the alternative hypothesis is mis-specified.
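The equivalence between maximizing the likelihood and minimizing the divergence term can be illustrated with a one-parameter model. The sketch below assumes an exponential model with unknown rate, uses a plain grid search over the term $-\frac{1}{n}\sum_j \ln f(x_j, \theta)$, and compares the result with the closed-form maximum likelihood estimate $n / \sum_j x_j$. The names and the grid are hypothetical choices.

```python
import math
import random

def mean_neg_loglik(sample, rate):
    """-(1/n) * sum_j ln f(x_j, rate) for the density rate * exp(-rate * x)."""
    return -sum(math.log(rate) - rate * x for x in sample) / len(sample)

def grid_minimizer(sample, grid):
    """Grid value with the smallest mean negative log-likelihood."""
    return min(grid, key=lambda rate: mean_neg_loglik(sample, rate))

if __name__ == "__main__":
    rng = random.Random(42)
    sample = [rng.expovariate(1.7) for _ in range(500)]
    grid = [0.5 + 0.002 * i for i in range(1500)]  # rates from 0.5 to about 3.5
    print("grid minimizer of the divergence term:", grid_minimizer(sample, grid))
    print("closed-form MLE n / sum(x):", len(sample) / sum(sample))
```

The two printed values should differ by no more than the grid spacing, since the objective is convex in the rate.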



3 Wrong Specification of Reference Distribution and Loss of Power in
Likelihood Ratio Tests

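Suppose we have data from a probability distribution $H(x|\theta, \lambda)$,
and want to test the hypothesis that
$(\theta = \theta_0, \lambda = \lambda_0)$. The usual likelihood ratio is
expressed as

$$\Lambda(\lambda_0, \theta_0) = \frac{\mathcal{L}(\lambda_0, \theta_0)}{\mathcal{L}(\hat{\lambda}, \hat{\theta})} \tag{13}$$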
where $\mathcal{L}$ denotes the likelihood and the notation $\hat{\xi}$ is
used for the unrestricted maximum likelihood estimate of the parameter
$\xi$. Suppose we are not willing (or not able) to compute
$\mathcal{L}(\hat{\lambda}, \hat{\theta})$ because estimating the parameter
$\lambda$ is troublesome, and decide to approximate the likelihood ratio
statistic using $\mathcal{L}(\lambda_1, \tilde{\theta})$ instead of the
likelihood under the alternative hypothesis, where $\tilde{\theta}$ is the
maximum likelihood estimator of $\theta$ given that $\lambda = \lambda_1$. We
then have the relation



A result by [Eguchi and Copas 1998], Section 3, states that the test
statistic generated this way is less powerful than the usual one, with the
loss in power equal to

$$\Delta_{\mathrm{Power}} = D_{KL}(f(x|\lambda, \theta) \,|\, f(x|\lambda_1, \theta)) \tag{15}$$

In the present paper, we are concerned with the case where the data
follow a distribution extended with the first Lehmann alternative, where
the original distribution is such that $F = F(x|\theta)$ for a parameter
$\theta$. The null hypothesis will be of the form

$$H_0: \theta = \theta_0, \; \lambda = 1 \tag{16}$$

against an alternative hypothesis

$$H_A: \theta \neq \theta_0, \; \lambda \neq 1 \tag{17}$$

If we erroneously consider that the data do not come from an extended
distribution $G(x|\lambda, \theta)$, but from a population that follows the
original $F(x|\theta)$ distribution, we can say that we are approximating the
log-likelihood under the alternative hypothesis as in the previous
discussion. In this case, the log-likelihood will be taken under the
hypothesis

$$H_{A'}: \theta \neq \theta_0, \; \lambda = 1 \tag{18}$$

which generates the following expression for the log-likelihood:

$$\ell(\mathbf{x}, \theta) = \sum_{j=1}^{n} \ln f(x_j|\theta) \tag{19}$$

Then we have that the test has less power than the one using the full $G$
distribution. The difference in the power of the tests is given by

$$\Delta_{\mathrm{Power}} = D_{KL}(g(x|\lambda, \theta) \,|\, g(x|1, \theta)) \tag{20}$$

The main point in the above discussion is that, for testing hypotheses about
the "original" parameter $\theta$, the tests using the extended version of
the distribution are more powerful whenever the data follow the extended
distribution, with a difference in the type II error rate measured by the
divergence above.
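Expanding equation (20) we have that

$$\Delta_P = D_{KL}(g(x|\lambda, \theta) \,|\, g(x|1, \theta)) \tag{21}$$

$$= \int g(x|\lambda, \theta) \ln\left( \frac{g(x|\lambda, \theta)}{g(x|1, \theta)} \right) dx \tag{22}$$

$$= \int \lambda f(x|\theta) F^{\lambda - 1}(x|\theta) \ln\left( \frac{\lambda f(x|\theta) F^{\lambda - 1}(x|\theta)}{f(x|\theta)} \right) dx \tag{23}$$

$$= \int \lambda f(x|\theta) F^{\lambda - 1}(x|\theta) \ln\left( \lambda F^{\lambda - 1}(x|\theta) \right) dx \tag{24}$$

$$= \ln \lambda + \int \lambda (\lambda - 1) f(x|\theta) F^{\lambda - 1}(x|\theta) \ln(F(x|\theta)) \, dx \tag{25}$$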

Integrating by parts, we get

$$\Delta_{\mathrm{Power}} = \ln \lambda + \frac{1 - \lambda}{\lambda} \tag{26}$$
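The integration by parts can be spelled out as follows. This is a sketch that assumes the boundary term $F^{\lambda} \ln F$ vanishes at both ends of the support, which holds because $F^{\lambda} \ln F \to 0$ as $F \to 0$ and $\ln F \to 0$ as $F \to 1$.

```latex
\begin{align*}
\int \lambda(\lambda - 1)\, f(x|\theta)\, F^{\lambda - 1}(x|\theta) \ln F(x|\theta)\, dx
  &= (\lambda - 1) \Big[ F^{\lambda}(x|\theta) \ln F(x|\theta) \Big]_{-\infty}^{\infty}
     - (\lambda - 1) \int f(x|\theta)\, F^{\lambda - 1}(x|\theta)\, dx \\
  &= 0 - (\lambda - 1) \cdot \frac{1}{\lambda}
   = \frac{1 - \lambda}{\lambda}.
\end{align*}
```

Adding $\ln \lambda$ to this value gives the loss of power in (26).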

The graph of the function in (26), depicted in Figure 1 for values of
$\lambda$ greater than one, shows the loss of power in our test when the
distribution of our data is one extended by the first Lehmann alternative
and we fail to notice that.
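The closed form in (26) can be checked against a direct numerical evaluation of the divergence. The sketch below uses the change of variables $u = F(x)$, under which the extended density becomes $\lambda u^{\lambda - 1}$ on $(0, 1)$ and the original density becomes uniform; the function names and the midpoint rule are illustrative choices. For $\lambda = 2$ the closed form gives $\ln 2 - 1/2 \approx 0.193$.

```python
import math

def power_loss(lam):
    """Closed form of equation (26): ln(lam) + (1 - lam) / lam."""
    return math.log(lam) + (1.0 - lam) / lam

def power_loss_numeric(lam, n_cells=200_000):
    """Midpoint-rule value of integral_0^1 g(u) * ln(g(u)) du, g(u) = lam * u**(lam - 1)."""
    h = 1.0 / n_cells
    total = 0.0
    for i in range(n_cells):
        u = (i + 0.5) * h  # cell midpoint, never exactly 0
        g = lam * u ** (lam - 1.0)
        total += g * math.log(g)
    return total * h

if __name__ == "__main__":
    for lam in (1.5, 2.0, 5.0):
        print(lam, power_loss(lam), power_loss_numeric(lam))
```

The loss is zero at $\lambda = 1$ and increases with $\lambda$ for $\lambda > 1$, since its derivative is $(\lambda - 1)/\lambda^2$.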
Figure 1: Loss of Power as a Function of $\lambda$, for $\lambda > 1$.